Three-Phase Text Error Correction Model for Korean SMS Messages

Authors

  • Jeunghyun Byun
  • So-Young Park
  • Seung-Wook Lee
  • Hae-Chang Rim
Abstract

In this paper, we propose a three-phase text error correction model consisting of a word spacing error correction phase, a syllable-based spelling error correction phase, and a word-based spelling error correction phase. To reduce the complexity of text error correction, the proposed model corrects text errors step by step. To correct word spacing errors, spelling errors, and mixed errors in SMS messages, the model handles the word spacing error correction phase and the spelling error correction phase separately. To combine the syllable-based approach, which covers various errors, with the word-based approach, which corrects some specific errors accurately, the model subdivides the spelling error correction phase into a syllable-based phase and a word-based phase. Experimental results show that the proposed model can improve performance by solving the text error correction problem with a divide-and-conquer strategy.

Key words: text error correction, word spacing errors, spelling errors, SMS messages
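
The step-by-step structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which models each phase uses, so every phase here is replaced by a simple lookup, and the example uses English toy data with single characters standing in for Korean syllables. The names LEXICON, SYLLABLE_FIXES, WORD_FIXES, and all function names are hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical toy resources, for illustration only (not taken from the paper).
LEXICON = {"see", "yu", "l8r"}                 # known surface forms, noisy ones included
SYLLABLE_FIXES: Dict[str, str] = {"8": "ate"}  # unit-level substitutions
WORD_FIXES: Dict[str, str] = {"yu": "you"}     # whole-word substitutions


def correct_spacing(text: str) -> str:
    """Phase 1: drop existing spaces and re-segment by greedy longest match."""
    chars = text.replace(" ", "")
    tokens: List[str] = []
    i = 0
    while i < len(chars):
        # Try the longest candidate first; fall back to a single character.
        for j in range(len(chars), i, -1):
            if chars[i:j] in LEXICON or j == i + 1:
                tokens.append(chars[i:j])
                i = j
                break
    return " ".join(tokens)


def correct_syllables(text: str) -> str:
    """Phase 2: replace each unit (here, one character) via a substitution table."""
    return "".join(SYLLABLE_FIXES.get(ch, ch) for ch in text)


def correct_words(text: str) -> str:
    """Phase 3: replace specific whole words via a dictionary."""
    return " ".join(WORD_FIXES.get(word, word) for word in text.split(" "))


# The phases run in the order named in the abstract.
PHASES: List[Callable[[str], str]] = [correct_spacing, correct_syllables, correct_words]


def correct(text: str) -> str:
    """Apply the three phases one after another."""
    for phase in PHASES:
        text = phase(text)
    return text


if __name__ == "__main__":
    print(correct("seeyu l8r"))
```

Because each phase only consumes the output of the previous one, a phase can be swapped for a statistical model without touching the others, which is the practical appeal of the divide-and-conquer arrangement.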


Similar resources

Two Phase Model for SMS Text Messages Refinement

In this paper, we propose a new model for refining SMS text messages where two different kinds of grammatical errors frequently occur together. A two-phase approach based on the divide and conquer strategy is presented where HMM-based model is used for correcting spacing errors in the first phase, and rule-based correction model is used for correcting spelling errors in the second phase. Experi...


An Effective Model for SMS Spam Detection Using Content-based Features and Averaged Neural Network

In recent years, there has been considerable interest among people to use short message service (SMS) as one of the essential and straightforward communications services on mobile devices. The increased popularity of this service also increased the number of mobile devices attacks such as SMS spam messages. SMS spam messages constitute a real problem to mobile subscribers; this worries telecomm...


Investigating the Effect of Mobile Short Message Service (SMS) on Diabetes Self-Care

Background: The objective of the current study is to assess the effectiveness of Mobile Short Message Service (SMS) intervention on education of basic self-care skills in patients with type 2 diabetes. Moreover, we aimed to determine whether delivering individually-tailored educational messages can be more effective than general educational messages. Methods: A total of 150 patients with dia...


Processing Informal, Romanized Pakistani Text Messages

Regardless of language, the standard character set for text messages (SMS) and many other social media platforms is the Roman alphabet. There are romanization conventions for some character sets, but they are used inconsistently in informal text, such as SMS. In this work, we convert informal, romanized Urdu messages into the native Arabic script and normalize non-standard SMS language. Doing s...


Normalizing Microtext

The use of computer mediated communication has resulted in a new form of written text—Microtext—which is very different from well-written text. Tweets and SMS messages, which have limited length and may contain misspellings, slang, or abbreviations, are two typical examples of microtext. Microtext poses new challenges to standard natural language processing tools which are usually designed for ...



Journal title:
  • IEICE Transactions

Volume 92-D

Publication date: 2009